Abstract 

This is primarily a pedagogical paper. The paper re-visits some well-known quantum 
information theory inequalities. It does this from a quantum Bayesian networks 
perspective. The paper illustrates some of the benefits of using quantum Bayesian 
networks to discuss quantum SIT (Shannon Information Theory). 
1 Introduction 



For a good textbook on classical (non-quantum) Shannon Information Theory (SIT), 
see, for example, Ref.[1] by Cover and Thomas. For a good textbook on quantum 
SIT, see, for example, Ref.[2] by Wilde. 

This paper is written assuming that the reader has first read a previous paper 
by the same author, Ref.[3], which is an introduction to quantum Bayesian networks 
for mixed states. 

This paper re-visits some well-known quantum information theory inequalities 
(mostly the monotonicity of the relative entropy and consequences thereof). It does 
this from a quantum Bayesian networks perspective. The paper illustrates some of 
the benefits of using quantum Bayesian networks to discuss quantum SIT. 



2 Preliminaries and Notation 

Reading all of Ref.[3] is a prerequisite to reading this paper. This section will introduce only notation which hasn't been defined already in Ref.[3].
Let

$$S_{x,y} = S_x \times S_y = \{(x,y) : x \in S_x, y \in S_y\} , \tag{1}$$
$$\mathcal{H}_{x,y} = \mathcal{H}_x \otimes \mathcal{H}_y = \mathrm{span}\{|x\rangle_x |y\rangle_y : x \in S_x, y \in S_y\} . \tag{2}$$
Suppose $\{P_{x,y}(x,y)\}_{\forall x,y} \in pd(S_{x,y})$. We will often use the expectation operators $E_x = \sum_x P(x)$, $E_{x,y} = \sum_{x,y} P(x,y)$, and $E_{y|x} = \sum_y P(y|x)$. Note that $E_{x,y} = E_x E_{y|x}$. Let

$$P(x:y) = \frac{P(x,y)}{P(x)P(y)} . \tag{3}$$

Note that $E_x P(x:y) = E_y P(x:y) = 1$.

We will use the following measures of various types of information (entropy): 

• The (plain) entropy of the random variable x is defined in the classical case by

$$H(x) = E_x \ln\frac{1}{P(x)} , \tag{4}$$

which we also call $H_{P_x}(x)$, $H\{P(x)\}_{\forall x}$, and $H(P_x)$. This quantity measures the spread of $P_x$. The quantum generalization of this is, for $\rho_x \in dm(\mathcal{H}_x)$,

$$S(x) = -\mathrm{tr}_x(\rho_x \ln \rho_x) , \tag{5}$$

which we also call $S_{\rho_x}(x)$ and $S(\rho_x)$.

One can also consider plain entropy for a joint random variable $x = (x_1, x_2)$. In the classical case, for $P_{x_1,x_2} \in pd(S_{x_1,x_2})$ with marginal probability distributions $P_{x_1}$ and $P_{x_2}$, one defines a joint entropy $H(x_1, x_2) = H(x)$ and partial entropies $H(x_1)$ and $H(x_2)$. The quantum generalization of this is, for $\rho_{x_1,x_2} \in dm(\mathcal{H}_{x_1,x_2})$ with partial density matrices $\rho_{x_1}$ and $\rho_{x_2}$, a joint entropy $S(x_1, x_2) = S(x)$ with partial entropies $S(x_1)$ and $S(x_2)$.

• The conditional entropy of y given x is defined in the classical case by

$$H(y|x) = E_{x,y} \ln\frac{1}{P(y|x)} \tag{6a}$$
$$= H(y,x) - H(x) , \tag{6b}$$

which we also call $H_{P_{x,y}}(y|x)$. This quantity measures the conditional spread of y given x. The quantum generalization of this is, for $\rho_{x,y} \in dm(\mathcal{H}_{x,y})$,

$$S(y|x) = S(y,x) - S(x) , \tag{7}$$
which we also call $S_{\rho_{x,y}}(y|x)$.

• The Mutual Information (MI) of x and y is defined in the classical case by

$$H(y:x) = E_{x,y} \ln P(x:y) = E_x E_y P(x:y) \ln P(x:y) \tag{8a}$$
$$= H(x) + H(y) - H(y,x) , \tag{8b}$$

which we also call $H_{P_{x,y}}(y:x)$. This quantity measures the correlation between x and y. The quantum generalization of this is, for $\rho_{x,y} \in dm(\mathcal{H}_{x,y})$,

$$S(y:x) = S(x) + S(y) - S(x,y) , \tag{9}$$
which we also call $S_{\rho_{x,y}}(y:x)$.

• The Conditional Mutual Information (CMI, which can be read as "see me") of x and y given $\lambda$ is defined in the classical case by:

$$H(y:x|\lambda) = E_{x,y,\lambda} \ln\frac{P(x,y|\lambda)}{P(x|\lambda)P(y|\lambda)} \tag{10a}$$
$$= E_{x,y,\lambda} \ln\frac{P(x,y,\lambda)P(\lambda)}{P(x,\lambda)P(y,\lambda)} \tag{10b}$$
$$= H(x|\lambda) + H(y|\lambda) - H(y,x|\lambda) , \tag{10c}$$

which we also call $H_{P_{x,y,\lambda}}(y:x|\lambda)$. This quantity measures the conditional correlation of x and y given $\lambda$. The quantum generalization of this is, for $\rho_{x,y,\lambda} \in dm(\mathcal{H}_{x,y,\lambda})$,

$$S(y:x|\lambda) = S(x|\lambda) + S(y|\lambda) - S(y,x|\lambda) , \tag{11}$$
which we also call $S_{\rho_{x,y,\lambda}}(y:x|\lambda)$.

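• The relative information of $P \in pd(S_x)$ divided by $Q \in pd(S_x)$ is defined by

$$D\{P(x)//Q(x)\}_{\forall x} = \sum_x P(x) \ln\frac{P(x)}{Q(x)} , \tag{12}$$

which we also call $D(P_x//Q_x)$. The quantum generalization of this is, for $\rho_x, \sigma_x \in dm(\mathcal{H}_x)$,

$$D(\rho_x//\sigma_x) = \mathrm{tr}_x\left(\rho_x(\ln\rho_x - \ln\sigma_x)\right) . \tag{13}$$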

Note that we define entropies using natural logs. Our strategy is to use natural 
log entropies for all intermediate analytical calculations, and to convert to base-2 logs 
at the end of those calculations if a base-2 log numerical answer is desired. Such a 
conversion is of course trivial using $\log_2 x = \frac{\ln x}{\ln 2}$ and $\ln 2 = 0.6931$.
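
As a small numerical illustration of the classical definitions above, the following sketch evaluates the plain entropy, the mutual information, and the nats-to-bits conversion. The joint distribution `P_XY` and all function names are hypothetical choices made for this example only; they do not come from the paper.

```python
import math

def entropy(p):
    """Plain entropy H(x) = E_x ln(1/P(x)) in nats; zero-probability terms contribute 0."""
    return sum(px * math.log(1.0 / px) for px in p if px > 0)

def mutual_information(pxy):
    """H(y:x) = E_{x,y} ln P(x:y), with P(x:y) = P(x,y)/(P(x)P(y)), in nats."""
    px = [sum(row) for row in pxy]
    py = [sum(col) for col in zip(*pxy)]
    return sum(
        pxy[i][j] * math.log(pxy[i][j] / (px[i] * py[j]))
        for i in range(len(px)) for j in range(len(py)) if pxy[i][j] > 0
    )

def nats_to_bits(h):
    """Convert an entropy from natural-log units to base-2 units."""
    return h / math.log(2)

# Hypothetical joint distribution P(x, y); rows are indexed by x, columns by y.
P_XY = [[0.4, 0.1],
        [0.1, 0.4]]
P_X = [sum(row) for row in P_XY]
P_Y = [sum(col) for col in zip(*P_XY)]

H_X = entropy(P_X)
H_Y = entropy(P_Y)
H_XY = entropy([p for row in P_XY for p in row])
MI = mutual_information(P_XY)   # should agree with H(x) + H(y) - H(y,x)
```

Both marginals are uniform here, so `nats_to_bits(H_X)` is one bit, and `MI` agrees with the entropy combination in Eq.(8b).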

The notation $@\rho\{\mathcal{F}\}$ will be used to indicate that all quantum entropies $S(\cdot)$ 
in statement $\mathcal{F}$ are to be evaluated at density matrix $\rho$. For example, $@\rho\{ S(a) + S(b|c) = 0 \}$ will stand for $S_\rho(a) + S_\rho(b|c) = 0$.

Define

$$I_x = \sum_x |x\rangle_x \langle x|_x . \tag{14}$$

Define $1^N$ to be the $N$-tuple whose components are all equal to one.

Recall from Ref.[3] that an amplitude $\{A(y|x)\}_{\forall y,x}$ is said to be an isometry if

$$\sum_y A(y|x) A^*(y|x') = \delta_x^{x'} \tag{15}$$

for all $x, x' \in S_x$.


3 Monotonicity of Relative Entropy (MRE) 

In this section, we will state the monotonicity of the relative entropy (MRE, which 
can be read as "more") and derive some of its many consequences, such as MI ≥ 0, 
CMI ≥ 0, and the data processing inequalities. 
3.1 General MRE Inequality

Claim 1 Suppose $\{P_a(a)\}_{\forall a \in S_a}$ and $\{Q_a(a)\}_{\forall a \in S_a}$ are both probability distributions. Suppose $\{T_{b|a}(b|a)\}_{\forall (b,a) \in S_{b,a}}$ is a transition probability matrix, meaning that its entries are non-negative and satisfy $\sum_b T_{b|a}(b|a) = 1$ for any $a \in S_a$. Then

$$D(T_{b|a}P_a // T_{b|a}Q_a) \le D(P_a // Q_a) , \tag{16}$$

where we are overloading the symbol $T_{b|a}$ so that it stands also for an $N_b \times N_a$ matrix, and we are overloading the symbols $P_a, Q_a$ so that they stand also for $N_a$-dimensional column vectors.

proof:

$$D(P//Q) = \sum_a P(a) \ln\frac{P(a)}{Q(a)} \tag{17a}$$
$$= \sum_b \sum_a T(b|a)P(a) \ln\frac{T(b|a)P(a)}{T(b|a)Q(a)} \tag{17b}$$
$$\ge \sum_b (TP)(b) \ln\frac{(TP)(b)}{(TQ)(b)} \tag{17c}$$
$$= D(TP//TQ) \tag{17d}$$

Eq.(17c) follows from the so called log-sum inequality (See Ref.[1]).
QED
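
The classical MRE inequality of Claim 1 is easy to check numerically. The sketch below does so for one hypothetical transition matrix `T` and one hypothetical pair of distributions `P`, `Q`; the numbers and function names are assumptions made for illustration, not values from the paper.

```python
import math

def rel_entropy(p, q):
    """Relative information D(P//Q) = sum_a P(a) ln(P(a)/Q(a)), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def apply_channel(t, p):
    """Return (TP)(b) = sum_a T(b|a) P(a), where t[b][a] = T(b|a)."""
    return [sum(t_b[a] * p[a] for a in range(len(p))) for t_b in t]

# Hypothetical 2 x 3 transition matrix T(b|a); each column sums to one.
T = [[0.9, 0.2, 0.1],
     [0.1, 0.8, 0.9]]
P = [0.7, 0.2, 0.1]
Q = [0.2, 0.3, 0.5]

D_BEFORE = rel_entropy(P, Q)
D_AFTER = rel_entropy(apply_channel(T, P), apply_channel(T, Q))
# MRE, Eq.(16): D_AFTER <= D_BEFORE
```

Passing both distributions through the same channel can only make them harder to tell apart, which is the content of Eq.(16).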

Recall from Ref.[3] that a channel superoperator $\mathcal{T}_{b|a}$ is a map from $dm(\mathcal{H}_a)$ to $dm(\mathcal{H}_b)$ which can be expressed as

$$\mathcal{T}_{b|a}(\cdot) = \sum_\mu K_\mu (\cdot) K_\mu^\dagger , \tag{18}$$

where the operators $K_\mu : \mathcal{H}_a \rightarrow \mathcal{H}_b$, called Kraus operators, satisfy:

$$\sum_\mu K_\mu^\dagger K_\mu = 1 . \tag{19}$$

Ref.[3] explains how a channel superop can be portrayed in terms of QB nets as a two body scattering diagram.

Claim 2 Suppose $\rho_a, \sigma_a \in dm(\mathcal{H}_a)$ and $\mathcal{T}_{b|a} : dm(\mathcal{H}_a) \rightarrow dm(\mathcal{H}_b)$ is a channel superop. Then

$$D(\mathcal{T}_{b|a}(\rho_a) // \mathcal{T}_{b|a}(\sigma_a)) \le D(\rho_a // \sigma_a) . \tag{20}$$
proof: See Ref.[2] and original references therein.
QED

Note that Eq.(16) is a special case of Eq.(20). Indeed, if $\mathcal{T}_{b|a}$ has Kraus operators $\{K_\mu\}_{\forall \mu}$, then let

$$\rho_b = \sum_\mu K_\mu \rho_a K_\mu^\dagger , \tag{21a}$$
$$\sigma_b = \sum_\mu K_\mu \sigma_a K_\mu^\dagger . \tag{21b}$$
Assume $\rho_a$ and $\sigma_a$ can both be diagonalized in the same basis as follows

$$\rho_a = \sum_a [\,|a\rangle_a\,] P_a(a) [\,h.c.\,] , \tag{22a}$$
$$\sigma_a = \sum_a [\,|a\rangle_a\,] Q_a(a) [\,h.c.\,] . \tag{22b}$$

Likewise, assume that $\rho_b$ and $\sigma_b$ can both be diagonalized in the same basis. Thus assume Eqs.(22), but with the letters a replaced by b. Then Eqs.(21) reduce to

$$P_b = T_{b|a}P_a , \tag{23a}$$
$$Q_b = T_{b|a}Q_a , \tag{23b}$$

where

$$T_{b|a}(b|a) = \sum_\mu |\langle b|K_\mu|a\rangle|^2 \tag{24}$$

for all $a \in S_a$ and $b \in S_b$. Clearly, this $T_{b|a}$ satisfies $\sum_b T_{b|a}(b|a) = 1$. Therefore the quantum MRE with diagonal density matrices is just the classical MRE.



3.2 Subadditivity of Joint Entropy (MI ≥ 0)

For any random variables a, b,

$$H(a,b) \le H(a) + H(b) . \tag{25}$$

This is sometimes called the subadditivity of the joint entropy, or the independence upper bound on the joint entropy. It can also be written as (i.e., conditioning reduces entropy)

$$H(b|a) \le H(b) , \tag{26}$$

or as (MI ≥ 0)

$$H(b:a) \ge 0 . \tag{27}$$
Claim 3 (MI ≥ 0) For any $\rho_{a,b} \in dm(\mathcal{H}_{a,b})$,

$$S(a,b) \le S(a) + S(b) , \tag{28}$$

or, equivalently,

$$S(b|a) \le S(b) , \tag{29}$$

or, equivalently,

$$S(b:a) \ge 0 . \tag{30}$$

proof: Apply MRE with $\mathcal{T} = \mathrm{tr}_a$:

$$0 = D(\rho_b // \rho_b) \le D(\rho_{a,b} // \rho_a \rho_b) = S(a:b) . \tag{31}$$

QED

3.3 Strong Subadditivity of Joint Entropy (CMI ≥ 0)

For any random variables a, b, e,

$$H(a,b|e) \le H(a|e) + H(b|e) . \tag{32}$$

This is sometimes called the strong subadditivity of the joint entropy. It can also be written as

$$H(b|a,e) \le H(b|e) , \tag{33}$$

or as (CMI ≥ 0)

$$H(b:a|e) \ge 0 . \tag{34}$$
Claim 4 (CMI ≥ 0) For any $\rho_{a,b,e} \in dm(\mathcal{H}_{a,b,e})$,

$$S(a,b|e) \le S(a|e) + S(b|e) , \tag{35}$$

or, equivalently,

$$S(b|a,e) \le S(b|e) , \tag{36}$$

or, equivalently,

$$S(b:a|e) \ge 0 . \tag{37}$$
proof: Apply MRE with $\mathcal{T} = \mathrm{tr}_a$ to get

$$D(\rho_{b,e} // \rho_e \tfrac{I_b}{N_b}) \le D(\rho_{a,b,e} // \rho_{a,e} \tfrac{I_b}{N_b}) . \tag{38}$$

Then note that

$$S(a:b|e) = D(\rho_{a,b,e} // \rho_{a,e} \tfrac{I_b}{N_b}) - D(\rho_{b,e} // \rho_e \tfrac{I_b}{N_b}) . \tag{39}$$

QED



3.4 Araki-Lieb Inequality 

Claim 5 (Araki-Lieb Inequality) For any $\rho_{a,b} \in dm(\mathcal{H}_{a,b})$,

$$|S(a) - S(b)| \le S(a,b) , \tag{40}$$

or, equivalently,

$$\left\{\begin{array}{l} -S(b) \le S(b|a) \\ -S(a) \le S(a|b) \end{array}\right. . \tag{41}$$

proof: Consider a pure state $\rho_{a,b,e} \in dm(\mathcal{H}_{a,b,e})$ with partial trace $\rho_{a,b}$. Then

$$S(b,e) \le S(b) + S(e) . \tag{42}$$

According to Claim 16, $S(b,e) = S(a)$ and $S(e) = S(a,b)$. These two identities allow us to excise any mention of e from Eq.(42). Thus Eq.(42) is equivalent to

$$S(a) \le S(b) + S(a,b) , \tag{43}$$

which immediately gives

$$-S(b) \le S(b|a) . \tag{44}$$

QED

Note that classically, one has

$$0 \overset{(a)}{\le} H(b|a) \overset{(b)}{\le} H(b) . \tag{45}$$

Inequality (a) follows from the definition of $H(b|a)$, and (b) follows from MI ≥ 0.
For quantum states, on the other hand,

$$-S(b) \overset{(a)}{\le} S(b|a) \overset{(b)}{\le} S(b) , \tag{46}$$
or, equivalently,

$$|S(b|a)| \le S(b) . \tag{47}$$
Inequality (a) follows from the Araki-Lieb inequality, and (b) follows from MI ≥ 0.
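
The lower bound in Eq.(46) is saturated by a maximally entangled pair. The sketch below checks this for a two-qubit Bell state with real amplitudes. It is an illustration under stated assumptions: the helper names are hypothetical, the joint entropy is set to zero because the joint state is pure, and $S(b) = S(a)$ is taken from Claim 16.

```python
import math

def eigvals_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = m
    mean = 0.5 * (p + r)
    dev = math.sqrt((0.5 * (p - r)) ** 2 + q * q)
    return [mean + dev, mean - dev]

def entropy_from_eigs(eigs):
    """Entropy -sum_l l ln(l) in nats; (numerically) zero eigenvalues are skipped."""
    return -sum(l * math.log(l) for l in eigs if l > 1e-15)

def reduced_a(amp):
    """rho_a = A A^T for a real amplitude matrix A(a, b) of a two-qubit pure state."""
    return [[sum(amp[a][b] * amp[a2][b] for b in range(2)) for a2 in range(2)]
            for a in range(2)]

s = 1.0 / math.sqrt(2.0)
BELL = [[s, 0.0],
        [0.0, s]]            # amplitudes of (|00> + |11>)/sqrt(2)

S_A = entropy_from_eigs(eigvals_sym2(reduced_a(BELL)))
S_AB = 0.0                   # the joint state is pure
S_B = S_A                    # Claim 16: S(a) = S(b) for a pure bipartite state
S_B_GIVEN_A = S_AB - S_A     # quantum conditional entropy S(b|a)
```

For this state `S_B_GIVEN_A` equals $-\ln 2$, so $S(b|a) = -S(b)$, a value that no classical conditional entropy can take.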

3.5 Monotonicity (Only in Some Special Cases) of Plain Entropy

Consider the two node CB net

[Diagram: two-node CB net $a \rightarrow b$] . (48)

For this net, $P_b = P_{b|a}P_a$. Assume also that $P_{b|a}$ is a square matrix (i.e., that $N_a = N_b$) and that it is doubly stochastic (i.e., that $\sum_b P(b|a) = 1$ for all a and $\sum_a P(b|a) = 1$ for all b. In other words, each of its columns and rows sums to one.). Then the classical MRE implies

$$D(P_b // \tfrac{1^N}{N}) \le D(P_a // \tfrac{1^N}{N}) , \tag{49}$$

where $N = N_a = N_b$. (The reason we need $\sum_a P(b|a) = 1$ is that we must have $P_{b|a} 1^N = 1^N$.) Next note that for any random variable x,

$$H(P_x) = \ln(N_x) - D(P_x // \tfrac{1^{N_x}}{N_x}) . \tag{50}$$

Thus,

$$H(b) \ge H(a) . \tag{51}$$

Thus, when $P_{b|a}$ is square and doubly stochastic, $P_b$ has a larger spread than $P_a$. 
This situation is sometimes described by saying that "mixing" increases entropy. 
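
The "mixing increases entropy" statement of Eq.(51) can be checked numerically. The sketch below uses a hypothetical 3 x 3 doubly stochastic matrix and a hypothetical input distribution; these numbers are assumptions made for illustration only.

```python
import math

def entropy(p):
    """Plain entropy in nats; zero-probability terms contribute 0."""
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

# Hypothetical doubly stochastic P(b|a), stored as [b][a]:
# every row and every column sums to one.
P_B_GIVEN_A = [[0.7, 0.2, 0.1],
               [0.2, 0.5, 0.3],
               [0.1, 0.3, 0.6]]
P_A = [0.8, 0.15, 0.05]

# P_b = P_{b|a} P_a
P_B = [sum(P_B_GIVEN_A[b][a] * P_A[a] for a in range(3)) for b in range(3)]

H_A = entropy(P_A)
H_B = entropy(P_B)   # Eq.(51) predicts H_B >= H_A
```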

An important scenario where the opposite is the case and $P_b$ has a smaller spread than $P_a$ is when $b = f(a)$ for some deterministic function $f : S_a \rightarrow S_b$. In this case, $P(b|a) = \delta(b, f(a))$ (clearly not a doubly stochastic transition matrix). Thus

$$H(b,a) = H(b|a) + H(a) = H(a) . \tag{52}$$

Also

$$H(b,a) = H(a|b) + H(b) . \tag{53}$$

Hence

$$H(a|b) + H(b) = H(a) . \tag{54}$$
But $H(a|b) \ge 0$. Hence

$$H(b) = H(f(a)) \le H(a) . \tag{55}$$

Loosely speaking, the random variable $f(a)$ varies over a smaller range than a (unless $f()$ is a bijection), so $f(a)$ has a smaller spread than a.
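
Eq.(55) can be illustrated with a small example. The sketch below pushes a hypothetical distribution through a hypothetical non-injective function `f` and compares the two entropies; both the distribution and the function are assumptions chosen for illustration.

```python
import math
from collections import defaultdict

def entropy(p):
    """Plain entropy in nats; zero-probability terms contribute 0."""
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

# Hypothetical distribution P(a) on S_a = {0, 1, 2, 3}.
P_A = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}

def f(a):
    """Hypothetical deterministic, non-injective map S_a -> S_b."""
    return a % 2

# P(b) = sum_a delta(b, f(a)) P(a)
P_B = defaultdict(float)
for a, pa in P_A.items():
    P_B[f(a)] += pa

H_A = entropy(P_A.values())
H_B = entropy(P_B.values())   # Eq.(55) predicts H_B <= H_A
```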

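Claim 6 Suppose $\rho_a \in dm(\mathcal{H}_a)$ and $\mathcal{T}_{b|a} : dm(\mathcal{H}_a) \rightarrow dm(\mathcal{H}_b)$ is a square (i.e., $N_a = N_b$) channel superop such that $I_b = \mathcal{T}_{b|a}(I_a)$. Then

$$S(\mathcal{T}_{b|a}(\rho_a)) \ge S(\rho_a) . \tag{56}$$

proof: Let $\rho_b = \mathcal{T}_{b|a}(\rho_a)$. Then MRE implies

$$D(\rho_b // \tfrac{I_b}{N}) \le D(\rho_a // \tfrac{I_a}{N}) , \tag{57}$$
where $N = N_a = N_b$. Now note that for $x = a, b$,

$$S(\rho_x) = \ln(N_x) - D(\rho_x // \tfrac{I_x}{N_x}) . \tag{58}$$

QED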

3.6 Entropy of Measurement 

Applying the cl(·) operator to a node ("classicizing" it) is like a "measurement". 
Thus, the following inequality is often called the entropy of measurement inequality. 

Claim 7 For any $\rho_x \in dm(\mathcal{H}_x)$ and orthonormal basis $\{|x\rangle_x\}_{\forall x}$,

$$H\{\langle x|\rho_x|x\rangle\}_{\forall x} \ge S(\rho_x) , \tag{59}$$

or, equivalently,

$$S_{\rho_x}(x_{cl}) \ge S_{\rho_x}(x) . \tag{60}$$

proof: This is a special case of Claim 6 with $a = x$, $b = x_{cl}$, and $\mathcal{T} = cl_x$.
QED
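
A quick numerical check of Eq.(59) for a single qubit is sketched below. The density matrix `RHO` is a hypothetical real symmetric example, and the closed-form 2x2 eigenvalue helper is an implementation choice for this illustration rather than anything taken from the paper.

```python
import math

def eigvals_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = m
    mean = 0.5 * (p + r)
    dev = math.sqrt((0.5 * (p - r)) ** 2 + q * q)
    return [mean + dev, mean - dev]

def entropy(p):
    """Entropy in nats of a list of probabilities (or eigenvalues)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 1e-15)

# Hypothetical qubit density matrix: real symmetric, unit trace, positive.
RHO = [[0.75, 0.25],
       [0.25, 0.25]]

H_DIAG = entropy([RHO[0][0], RHO[1][1]])   # H{<x|rho|x>}: entropy after classicizing
S_RHO = entropy(eigvals_sym2(RHO))         # von Neumann entropy S(rho)
# Eq.(59) predicts H_DIAG >= S_RHO
```

The two sides coincide only when `RHO` is already diagonal in the measurement basis.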

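Note that one can prove many other similar inequalities by appealing to MRE with $\mathcal{T} = cl_a$. For instance, for any $\rho_{a,b} \in dm(\mathcal{H}_{a,b})$,

$$S_{\rho_{a,b}}(b, a_{cl}) \ge S_{\rho_{a,b}}(b, a) , \tag{61}$$

and

$$S_{\rho_{a,b}}(b : a_{cl}) \le S_{\rho_{a,b}}(b : a) . \tag{62}$$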
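3.7 Entropy of Preparation

An ensemble $\{\sqrt{w_j}|\psi_j\rangle\}_{\forall j}$ for a system can be described as a preparation of the system. Thus the following inequality is often called the entropy of preparation inequality.

Claim 8 Suppose the weights $\{w_j\}_{\forall j \in S_j}$ are non-negative numbers that sum to one, and $\{|\psi_j\rangle_x\}_{\forall j \in S_j}$ are normalized states that span $\mathcal{H}_x$. Let

$$\rho_x = \sum_j w_j [\,|\psi_j\rangle_x\,][\,h.c.\,] = \mathrm{tr}_j(\rho_{x,j}) , \tag{63}$$

where $\rho_{x,j} \in dm(\mathcal{H}_{x,j})$ is a pure state (a "purification" of $\rho_x$). Then

$$H\{w_j\}_{\forall j} \ge S(\rho_x) , \tag{64}$$

or, equivalently,

$$@\rho_{x,j}\{ S(j_{cl}) \ge S(x) \} . \tag{65}$$

The inequality becomes an equality iff the states $\{|\psi_j\rangle_x\}_{\forall j}$ are orthonormal, in which case the weights $\{w_j\}_{\forall j}$ are the eigenvalues of $\rho_x$.

proof: Let

$$\rho_{x,j} = [\,|\psi\rangle_{x,j}\,][\,h.c.\,] , \tag{66}$$

where

$$|\psi\rangle_{x,j} = \sum_{x,j} A(x,j)|x\rangle_x|j\rangle_j , \tag{67}$$

where

$$A(x,j) = A(x|j)A(j) ,$$

and

$$A(x|j) = \langle x|\psi_j\rangle , \quad A(j) = \sqrt{w_j} . \tag{68}$$

Then

$$@\rho_{x,j}\{ H\{w_j\}_{\forall j} = S(j_{cl}) \overset{(a)}{\ge} S(j) \overset{(b)}{=} S(x) \} . \tag{69}$$

(a) follows from the entropy of measurement inequality (Section 3.6). Note that (a) becomes an equality iff the states $\{|\psi_j\rangle_x\}_{\forall j}$ are orthonormal.
(b) follows because $\rho_{x,j}$ is a pure state.

QED
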



3.8 Data Processing Inequalities 

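Consider the following CB net

[Diagram: three-node Markov chain CB net $a \rightarrow b \rightarrow c$]

Classical MRE with $T = P_{c|b}$ implies

$$D(P_{c,a} // P_c P_a) \le D(P_{b,a} // P_b P_a) . \tag{72}$$

Thus

$$H(c:a) \le H(b:a) . \tag{73}$$

Eq.(73) is called a data processing inequality.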
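
The data processing inequality of Eq.(73) can be verified numerically for a three-node Markov chain. In the sketch below, the prior `P_A` and the two transition matrices are hypothetical numbers chosen for illustration, and the chain direction $a \rightarrow b \rightarrow c$ is an assumption of this example.

```python
import math

def mutual_information(pxy):
    """Mutual information in nats of a joint distribution stored as a nested list."""
    px = [sum(row) for row in pxy]
    py = [sum(col) for col in zip(*pxy)]
    return sum(
        pxy[i][j] * math.log(pxy[i][j] / (px[i] * py[j]))
        for i in range(len(px)) for j in range(len(py)) if pxy[i][j] > 0
    )

# Hypothetical Markov chain a -> b -> c.
P_A = [0.3, 0.7]
P_B_GIVEN_A = [[0.9, 0.2],
               [0.1, 0.8]]     # stored as [b][a]; columns sum to one
P_C_GIVEN_B = [[0.7, 0.4],
               [0.3, 0.6]]     # stored as [c][b]; columns sum to one

# Joint distributions P(b, a) and P(c, a) = sum_b P(c|b) P(b, a)
P_BA = [[P_B_GIVEN_A[b][a] * P_A[a] for a in range(2)] for b in range(2)]
P_CA = [[sum(P_C_GIVEN_B[c][b] * P_BA[b][a] for b in range(2)) for a in range(2)]
        for c in range(2)]

MI_BA = mutual_information(P_BA)   # H(b:a)
MI_CA = mutual_information(P_CA)   # H(c:a); Eq.(73) predicts MI_CA <= MI_BA
```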
Next consider the following CB net

[Diagram: four-node Markov chain CB net with nodes $a$, $x$, $y$, $b$] (74)

where node y is deterministic with $P(y|x) = \delta(y, f(x))$. The data processing inequality applied to graph Eq.(74) gives

$$H(f(x) : a) \le H(x : a) , \tag{75}$$

and

$$H(b : x) \le H(b : f(x)) . \tag{76}$$
Note that for any random variable z, one has

$$H(f(z) : z) = H(f(z)) - H(f(z)|z) = H(f(z)) . \tag{77}$$
Combining Eqs.(75) and (77) yields¹

$$H(f(x)) = H(f(x) : a)|_{a \rightarrow x} \le H(x : a)|_{a \rightarrow x} = H(x) . \tag{78}$$

Now let's try to find quantum analogues to the classical data processing inequalities. To do so, we will use the following QB nets.
For $j \ge 1$, let $\beta_j = (b_j, e_j)$. Define

¹ What we really mean by the limit $a \rightarrow x$ is that $P(x|a) \rightarrow \delta_a^x$. Taking $b \rightarrow x$ in Eq.(76) would not work because b and x are not adjacent to each other whereas a and x are.
[Equations (79)-(83): QB net diagrams that did not survive extraction. Eqs.(79) and (80) define the density matrices $\rho^{(j)}$ of the chain of QB nets; Eqs.(81)-(83) give examples, including $\rho^{(2)}$ and a density matrix $\rho^{(1)}$ obtained from $\rho^{(2)}$ by an erase operation.]


Note that the operations of tracing versus erasing a node from a density matrix (and 
corresponding QB net) are different. They can produce different density matrices. 

Let h_Q= h_. For $j \ge 1$, assume the amplitude $A(\beta_j|b_{j-1})$ comes from a channel 
superoperator $\mathcal{T}_{\beta_j|b_{j-1}}$. Hence, it must be an isometry. 

Some quantum data processing inequalities refer to a single QB net, whereas 
others refer to multiple ones. The next two sections address these two possibilities. 



3.8.1 Single-Graph Data Processing 



Claim 9 For $\rho^{(2)}$ given by the QB net of Eq.(81),

$$S_{\rho^{(2)}}(b_2 : a) \overset{(a)}{\le} S_{\rho^{(2)}}(b_1 : a) \overset{(b)}{\le} S_{\rho^{(2)}}(b_0 : a) . \tag{84}$$

(Note that the $e_j$ have been traced over.)
proof: Inequality (b) is just a special case of inequality (a). Inequality (a) can be established as follows.

$$@\rho^{(2)}\{\; S(b_1 : a) - S(b_2 : a) = S(b_1 : a) - [S(b_1, b_2 : a) - S(b_1 : a|b_2)] \tag{85a}$$
$$= -S(b_2 : a|b_1) + S(b_1 : a|b_2) \tag{85b}$$
$$= S(b_1 : a|b_2) \tag{85c}$$
$$\ge 0 \;\} . \tag{85d}$$

(85c): Follows because $S(b_2 : a|b_1) = 0$ since $b_1$ is classical and at the middle of a Markov chain. See Claim 13.

(85d): Follows because CMI ≥ 0.
QED

3.8.2 Multi-Graph Data Processing 

The following claim was proven by Schumacher and Nielsen in Ref.[5].
Claim 10 For the density matrices $\rho^{(j)}$ given by the QB net of Eq.(80),

$$S_{\rho^{(3)}}(b_3 : a) \overset{(a)}{\le} S_{\rho^{(2)}}(b_2 : a) \overset{(b)}{\le} S_{\rho^{(1)}}(b_1 : a) . \tag{86}$$

(Note that the $e_j$ have been traced over.)
proof:

Inequalities (a) and (b) both follow from MRE because

[Eq.(87): lost in extraction]

and

[Eq.(88): lost in extraction]
QED
4 Hybrid Entropies With Both Classical and Quantum Random Variables

4.1 Conditioning Entropy on a Classical Random Variable 

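From the definition of $H(b|a)$, it's clear that $H(b|a) \ge 0$. On the other hand, $S(b|a)$ can sometimes be negative. One case where $S(b|a)$ is guaranteed to be non-negative is when the random variable being conditioned on is classical.

Claim 11 For any $\rho_{a,b} \in dm(\mathcal{H}_{a,b})$,

$$S_{\rho_{a,b}}(b|a_{cl}) \ge \max(0, S_{\rho_{a,b}}(b|a)) . \tag{89}$$

proof: By MRE with $\mathcal{T} = cl_a$,

$$S(b : a_{cl}) \le S(b : a) . \tag{90}$$

But

$$S(b : a_{cl}) = S(b) - S(b|a_{cl}) , \tag{91}$$

and

$$S(b : a) = S(b) - S(b|a) . \tag{92}$$

Hence

$$S(b|a_{cl}) \ge S(b|a) . \tag{93}$$
One can express $\rho_{b,a_{cl}}$ as

$$\rho_{b,a_{cl}} = \sum_a P(a) [\,|a\rangle_a\,] \rho_{b|a} [\,h.c.\,] , \tag{94}$$

where $\{P(a)\}_{\forall a} \in pd(S_a)$ and $\rho_{b|a} \in dm(\mathcal{H}_b)$ for all a. Therefore

$$S(\rho_{b,a_{cl}}) = -\mathrm{tr}_b \sum_a P(a)\rho_{b|a} \ln(P(a)\rho_{b|a}) \tag{95a}$$
$$= H\{P(a)\}_{\forall a} + \sum_a P(a) S(\rho_{b|a}) . \tag{95b}$$

Hence,

$$S_{\rho_{a,b}}(b|a_{cl}) = S(\rho_{b,a_{cl}}) - S(\rho_{a_{cl}}) = \sum_a P(a) S(\rho_{b|a}) \ge 0 . \tag{96}$$

QED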
4.2 Clone Random Variables

We'll say two random variables are clones of each other if they are perfectly correlated. 
Classical and quantum clone random variables behave very differently as far as entropy 
is concerned. In this section, we will show that two classical clones can be merged 
without changing the entropy, but not so for two quantum clones. 
Consider the following CB net

[Diagram: CB net with nodes $a'$, $a$, and $b$] , (97)

where

$$P(a'|a) = \delta_a^{a'} . \tag{98}$$

Since $P(a',a) = P(a)\delta_a^{a'}$, one gets

$$H(a, a') = H(a) = H(a') , \tag{99}$$
$$H(a|a') = H(a'|a) = 0 , \tag{100}$$
$$H(a : a') = H(a) = H(a') , \tag{101}$$
$$H(b, a, a') = H(b, a) = H(b, a') , \tag{102}$$
$$H(b, a|a') = H(b|a) = H(b|a') . \tag{103}$$

All these results can be described by saying that the classical clone random variables 
a and a' are interchangeable and that often they can be "merged" into a single 
random variable without changing the entropy. 
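
The classical clone identities (99) and (100) can be confirmed with a few lines of arithmetic. In the sketch below, the distribution `P_A` is a hypothetical example and the variable names are assumptions of this illustration.

```python
import math

def entropy(p):
    """Plain entropy in nats; zero-probability terms contribute 0."""
    return sum(pi * math.log(1.0 / pi) for pi in p if pi > 0)

# Hypothetical distribution P(a).
P_A = [0.2, 0.3, 0.5]
N_A = len(P_A)

# Joint distribution of the clones: P(a, a') = P(a) delta(a, a')
P_AA = [[P_A[a] if a == a2 else 0.0 for a2 in range(N_A)] for a in range(N_A)]
P_AP = [sum(col) for col in zip(*P_AA)]          # marginal of a'

H_A = entropy(P_A)
H_AA = entropy([p for row in P_AA for p in row])  # H(a, a')
H_A_GIVEN_AP = H_AA - entropy(P_AP)               # H(a|a') = H(a, a') - H(a')
```

Here `H_AA` equals `H_A` and `H_A_GIVEN_AP` vanishes, matching Eqs.(99) and (100).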

Quantum clone random variables, on the other hand, cannot be merged in general. For example, for a general state $\rho_{a,a'}$, one has $S(a, a') \ne S(a)$, even if

$$\langle a, a'|\rho_{a,a'}|a, a'\rangle \propto \delta_a^{a'} \tag{104}$$

for all $a, a' \in S_a$. For example, when

$$\rho_{a,a'} = \left[\sum_a A(a)\,|a\rangle_a |a\rangle_{a'}\right][\,h.c.\,] , \tag{105}$$

Eq.(104) is satisfied. However, $S(a, a') = 0$ and $S(a) = S(a') \ne 0$. Hence, $S(a, a') \ne S(a)$.

Similarly, for a general state $\rho_{b,a,a'}$, $S(b, a, a') \ne S(b, a)$. For example when

$$\rho_{b,a,a'} = \left[\sum_{b,a} A(b,a)\,|b\rangle_b |a\rangle_a |a\rangle_{a'}\right][\,h.c.\,] , \tag{106}$$

Eq.(104) is satisfied. However, $S(b, a, a') = 0$ and $S(b, a) = S(a') \ne 0$. Hence, $S(b, a, a') \ne S(b, a)$.



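Claim 12 Suppose

$$\rho_{b,a_{cl},a'_{cl}} = \sum_a P(a) [\,|a\rangle_a |a\rangle_{a'}\,] \rho_{b|a} [\,h.c.\,] , \tag{107}$$

where $\{P(a)\}_{\forall a} \in pd(S_a)$ and $\rho_{b|a} \in dm(\mathcal{H}_b)$ for all a. Then

$$S(b, a, a') = S(b, a) = S(b, a') , \tag{108}$$

and

$$S(b, a|a') = S(b|a) = S(b|a') . \tag{109}$$

proof:

For any density matrix $\rho$ with no zero eigenvalues, $\ln\rho$ can be expressed as an infinite power series in powers of $\rho$:

$$\ln\rho = \sum_{j=0}^{\infty} c_j \rho^j , \tag{110}$$

for some real numbers $c_j$ that are independent of $\rho$.
Note that

$$\mathrm{tr}_{a'}\left(\rho^j_{b,a_{cl},a'_{cl}}\right) = \mathrm{tr}_{a'} \sum_a P^j(a) [\,|a\rangle_a |a\rangle_{a'}\,] \rho^j_{b|a} [\,h.c.\,] \tag{111a}$$
$$= \sum_a P^j(a) [\,|a\rangle_a\,] \rho^j_{b|a} [\,h.c.\,] \tag{111b}$$
$$= \rho^j_{b,a_{cl}} . \tag{111c}$$

Thus, the operations of $\mathrm{tr}_{a'}$ and raising-to-a-power commute when acting on $\rho_{b,a_{cl},a'_{cl}}$. (This is not the case for $\rho_{b,a,a'}$ given by Eq.(106)).
Finally, note that

$$S(b, a, a') = -\mathrm{tr}_{b,a,a'}\left(\rho_{b,a_{cl},a'_{cl}} \ln\rho_{b,a_{cl},a'_{cl}}\right) \tag{112a}$$
$$= -\sum_j c_j\, \mathrm{tr}_{b,a,a'}\left(\rho^{j+1}_{b,a_{cl},a'_{cl}}\right) \tag{112b}$$
$$= -\sum_j c_j\, \mathrm{tr}_{b,a}\left(\rho^{j+1}_{b,a_{cl}}\right) \tag{112c}$$
$$= S(b, a) = S(b, a') . \tag{112d}$$

QED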

4.3 Conditioning CMI On the Middle of a Tri-node Markov-Like Chain

We will refer to a node with 2 incoming arrows and no outgoing ones as a collider. 
Let's consider all CB nets with 3 nodes and 2 arrows. These can have either one 
collider or none. 

The CB net with one collider is

$$a \rightarrow e \leftarrow b . \tag{113}$$

For this net, $P(a,b|e) \ne P(a|e)P(b|e)$ so $H(a : b|e) \ne 0$.

There are 3 CB nets with no collider: the fan-out (a.k.a. broadcast, or fork) 
net, and 2 Markov chains (in opposite directions):

$$a \leftarrow e \rightarrow b , \tag{114}$$
$$a \leftarrow e \leftarrow b , \tag{115}$$
$$a \rightarrow e \rightarrow b . \tag{116}$$

We will refer to these 3 graphs as tri-node Markov-like chains. For all 3 of these 
nets $P(a,b|e) = P(a|e)P(b|e)$ so $H(a : b|e) = 0$. In this case we say a and b are 
conditionally independent (of e). 
Claim 13 Let

$\rho^{fan-out}_{a,b,e}$ = [QB net diagram for the fan-out graph; diagram lost in extraction] (117)

and

$\rho^{Markov}_{a,b,e}$ = [QB net diagram for the Markov chain graph; diagram lost in extraction] . (118)

With $\rho_{a,b,e}$ equal to either $\rho^{fan-out}_{a,b,e}$ or $\rho^{Markov}_{a,b,e}$,

$$S_{\rho_{a,b,e}}(a : b|e_{cl}) = 0 . \tag{119}$$


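proof:

At the end of this proof, we will show that for both of these QB nets, $\rho_{a,b,e_{cl}}$ can be expressed as

$$\rho_{a,b,e_{cl}} = \sum_e P(e) [\,|e\rangle_e\,] \rho_{a|e}\rho_{b|e} [\,h.c.\,] , \tag{120}$$

where $\{P(e)\}_{\forall e} \in pd(S_e)$, and $\rho_{a|e} \in dm(\mathcal{H}_a)$, $\rho_{b|e} \in dm(\mathcal{H}_b)$ for all $e \in S_e$. Let's assume this for now. Then

$$S(a, b, e_{cl}) = -\mathrm{tr}_{a,b} \sum_e \left\{P(e)\rho_{a|e}\rho_{b|e} \ln\left(P(e)\rho_{a|e}\rho_{b|e}\right)\right\} \tag{121a}$$
$$= H\{P(e)\}_{\forall e} + \sum_e P(e)\left[S(\rho_{a|e}) + S(\rho_{b|e})\right] . \tag{121b}$$

Hence

$$S(a, b|e_{cl}) = \sum_e P(e)\left[S(\rho_{a|e}) + S(\rho_{b|e})\right] . \tag{122}$$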
One can show in the same way that also

$$S(a|e_{cl}) = \sum_e P(e) S(\rho_{a|e}) , \tag{123}$$

and

$$S(b|e_{cl}) = \sum_e P(e) S(\rho_{b|e}) . \tag{124}$$

Thus

$$S(a : b|e_{cl}) = S(a|e_{cl}) + S(b|e_{cl}) - S(a, b|e_{cl}) = 0 . \tag{125}$$

Now let's show that $\rho_{a,b,e_{cl}}$ has the form Eq.(120) for both QB nets.
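For the fan-out net,

$$\rho^{fan-out}_{a,b,e_{cl}} = \sum_{\alpha_0,\epsilon_0,\beta_0}\ \sum_{\alpha_1,\epsilon_1,\beta_1}\ \sum_e \left[\begin{array}{l} \sum_a A(a|e,\alpha_0)A(\alpha_1|a)A(\alpha_0)|a\rangle_a \\ \sum_b A(b|e,\beta_0)A(\beta_1|b)A(\beta_0)|b\rangle_b \\ A(e|\epsilon_0)A(\epsilon_1|e)A(\epsilon_0)|e\rangle_e \end{array}\right] [\,h.c.\,] . \tag{126}$$

Set

$$\rho_{a|e} = C_{a|e} \sum_{\alpha_0,\alpha_1} \left[\sum_a A(a|e,\alpha_0)A(\alpha_1|a)A(\alpha_0)|a\rangle_a\right] [\,h.c.\,] \tag{127}$$

and

$$\rho_{b|e} = C_{b|e} \sum_{\beta_0,\beta_1} \left[\sum_b A(b|e,\beta_0)A(\beta_1|b)A(\beta_0)|b\rangle_b\right] [\,h.c.\,] . \tag{128}$$

For $x = a, b$, the constant $C_{x|e}$ depends on e and is defined so that $\mathrm{tr}_x\,\rho_{x|e} = 1$.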
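For the Markov chain net,

$$\rho^{Markov}_{a,b,e_{cl}} = \sum_{\alpha_0,\epsilon_0,\beta_0}\ \sum_{\alpha_1,\epsilon_1,\beta_1}\ \sum_e \left[\begin{array}{l} \sum_a A(a|e,\alpha_0)A(\alpha_1|a)A(\alpha_0)|a\rangle_a \\ \sum_b A(b|\beta_0)A(\beta_1|b)A(\beta_0)|b\rangle_b \\ A(e|b,\epsilon_0)A(\epsilon_1|e)A(\epsilon_0)|e\rangle_e \end{array}\right] [\,h.c.\,] . \tag{129}$$

Set

$$\rho_{a|e} = C_{a|e} \sum_{\alpha_0,\alpha_1} \left[\sum_a A(a|e,\alpha_0)A(\alpha_1|a)A(\alpha_0)|a\rangle_a\right] [\,h.c.\,] \tag{130}$$

and

$$\rho_{b|e} = C_{b|e} \sum_{\epsilon_0,\epsilon_1}\ \sum_{\beta_0,\beta_1} \left[\sum_b \begin{array}{l} A(b|\beta_0)A(\beta_1|b)A(\beta_0) \\ A(e|b,\epsilon_0)A(\epsilon_1|e)A(\epsilon_0) \end{array} |b\rangle_b\right] [\,h.c.\,] , \tag{131}$$
where again, $C_{a|e}$ and $C_{b|e}$ are defined so that the density matrices $\rho_{a|e}$ and $\rho_{b|e}$ have unit trace.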

For both graphs, if we define

$$P(e) = \mathrm{tr}_{a,b}\, \langle e|\rho_{a,b,e}|e\rangle , \tag{132}$$

then Eq.(120) is satisfied.
QED 



4.4 Tracing the Output of an Isometry 

This section will mention an observation that is pretty trivial, but arises frequently 
so it is worth pointing out explicitly. 

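Consider the following density matrix

$$\rho_{c,b,a} = \left[\sum_{c,b,a} A(c|b)|c\rangle_c\ A(b|a)|b\rangle_b\ A(a)|a\rangle_a\right] [\,h.c.\,] . \tag{133}$$

Assume that $A(c|b)$ is an isometry. Then

$$\mathrm{tr}_c\,\rho_{c,b,a} = \sum_b \left[\sum_a A(b|a)|b\rangle_b\ A(a)|a\rangle_a\right] [\,h.c.\,] = \rho_{b_{cl},a} \tag{134}$$

and

$$@\rho_{c,b,a}\{ S(b,a) = S(b_{cl},a) \} . \tag{135}$$

Thus, we observe that tracing over all the output indices of an isometry amplitude 
embedded within a density matrix converts the inputs of that isometry amplitude 
into classical random variables. 

Next consider the following density matrix.

$$\rho_{c,a} = \sum_b \left[\sum_{c,a} A(c|b)|c\rangle_c\ A(b|a)A(a)|a\rangle_a\right] [\,h.c.\,] . \tag{136}$$

Assume that both $A(c|b)$ and $A(b|a)$ are isometries. Then

$$\mathrm{tr}_c\,\rho_{c,a} = \sum_a \left[A(a)|a\rangle_a\right] [\,h.c.\,] = \rho_{a_{cl}} \tag{137}$$

and

$$@\rho_{c,a}\{ S(a) = S(a_{cl}) \} . \tag{138}$$

Thus, we observe that two isometries joined by slashed variables behave as if they 
were just one isometry. 
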
4.5 Holevo Information

Suppose $\{P(x)\}_{\forall x} \in pd(S_x)$ and $\rho_{q|x} \in dm(\mathcal{H}_q)$ for all x. Set

$$\rho_q = E_x \rho_{q|x} = \sum_x P(x)\rho_{q|x} . \tag{139}$$

Then the Holevo information for the ensemble $\{P(x), \rho_{q|x}\}_{\forall x}$ is defined as

$$Hol\{P(x), \rho_{q|x}\}_{\forall x} = S(E_x\rho_{q|x}) - E_x S(\rho_{q|x}) = [S, E_x]\rho_{q|x} . \tag{140}$$

Claim 14 Let

$$\rho_{q,x} = \rho_{q,x_{cl}} = \sum_x P(x) [\,|x\rangle_x\,] \rho_{q|x} [\,h.c.\,] , \tag{141}$$

where $\{P(x)\}_{\forall x} \in pd(S_x)$ and $\rho_{q|x} \in dm(\mathcal{H}_q)$ for all x. Then

$$Hol\{P(x), \rho_{q|x}\}_{\forall x} = S_{\rho_{q,x}}(q : x_{cl}) . \tag{142}$$
Thus, the Holevo information is a MI with one of the two random variables classical.
proof:

$$S_{\rho_{q,x}}(q, x_{cl}) = -\mathrm{tr}_q \sum_x \left\{P(x)\rho_{q|x} \ln\left(P(x)\rho_{q|x}\right)\right\} \tag{143a}$$
$$= H\{P(x)\}_{\forall x} + E_x S(\rho_{q|x}) . \tag{143b}$$

Hence

$$@\rho_{q,x}\{\; S(q : x_{cl}) = S(q) - S(q|x_{cl}) \tag{144a}$$
$$= S(q) - E_x S(\rho_{q|x}) \tag{144b}$$
$$= S(E_x\rho_{q|x}) - E_x S(\rho_{q|x}) \;\} . \tag{144c}$$

QED
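
To make Eq.(140) concrete, the sketch below evaluates the Holevo information of a two-member qubit ensemble. The ensemble (equal weights on $|0\rangle\langle 0|$ and $|+\rangle\langle +|$) and the helper names are hypothetical choices for illustration; only real symmetric 2x2 density matrices are handled.

```python
import math

def eigvals_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix [[p, q], [q, r]]."""
    (p, q), (_, r) = m
    mean = 0.5 * (p + r)
    dev = math.sqrt((0.5 * (p - r)) ** 2 + q * q)
    return [mean + dev, mean - dev]

def vn_entropy(rho):
    """von Neumann entropy in nats of a real symmetric 2x2 density matrix."""
    return -sum(l * math.log(l) for l in eigvals_sym2(rho) if l > 1e-15)

# Hypothetical ensemble {P(x), rho_{q|x}}.
P = [0.5, 0.5]
RHO_GIVEN_X = [
    [[1.0, 0.0], [0.0, 0.0]],   # |0><0|
    [[0.5, 0.5], [0.5, 0.5]],   # |+><+|
]

# rho_q = E_x rho_{q|x}
RHO_Q = [[sum(P[x] * RHO_GIVEN_X[x][i][j] for x in range(2)) for j in range(2)]
         for i in range(2)]

# Hol = S(E_x rho_{q|x}) - E_x S(rho_{q|x})
HOLEVO = vn_entropy(RHO_Q) - sum(P[x] * vn_entropy(RHO_GIVEN_X[x]) for x in range(2))
H_P = sum(p * math.log(1.0 / p) for p in P)   # entropy of the classical label x
```

Because the two member states are pure but not orthogonal, `HOLEVO` is about 0.4165 nats, strictly below `H_P` $= \ln 2$.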

5 Holevo Bound 

In this section we prove the so called Holevo Bound, which is an upper bound on the accessible information. The accessible information is a figure of merit of a quantum ensemble. The upper bound is given by the Holevo information. The proof of the Holevo Bound² that we give next utilizes, and therefore illustrates, many of the concepts and inequalities that were introduced earlier in this paper.

Consider a density matrix $\rho_q$ expressible in the form Eq.(139). It is useful to re-express $\rho_q$ using the eigenvalue decompositions of the density matrices $\rho_{q|x}$. For some $Q$ with $S_Q = S_q$, suppose the eigenvalue decompositions of the $\rho_{q|x}$ are given by

$$\rho_{q|x} = \sum_Q \lambda_{Q|x} [\,|\lambda_{Q|x}\rangle_q\,] [\,h.c.\,] \tag{145}$$

for all x. Define

$$A(x) = \sqrt{P(x)} , \tag{146}$$

$$A(Q|x) = \sqrt{\lambda_{Q|x}} , \tag{147}$$

and

$$A(q|Q,x) = \langle q|\lambda_{Q|x}\rangle . \tag{148}$$

Then

$$\rho_q = \sum_{x,Q} \left[\sum_q A(q|Q,x)A(Q|x)A(x)|q\rangle_q\right] [\,h.c.\,] . \tag{149}$$

It is useful to find a purification of $\rho_q$; that is, a pure state $\rho_{q,r}$ such that $\rho_q = \mathrm{tr}_r(\rho_{q,r})$. One possible purification of $\rho_q$ is given by

$$\rho_{q,Q,x} = \left[\sum_{x,Q} \begin{array}{l} \sum_q A(q|Q,x)|q\rangle_q \\ A(Q|x)|Q\rangle_Q \\ A(x)|x\rangle_x \end{array}\right] [\,h.c.\,] \tag{150}$$

with $r = (Q, x)$.

Let $S_{q_1} = S_{q_2} = S_q$, and $S_{y_j} = S_y$ for $j = 1, 2, 3$. Suppose $\rho_{q_1} \in dm(\mathcal{H}_{q_1})$ is defined by Eq.(139) with $q$ replaced by $q_1$. Suppose $\rho_{q_1}$ is transformed to $\rho'_{q_2} \in dm(\mathcal{H}_{q_2})$ by a quantum channel with Kraus operators $\{K_y\}_{\forall y}$. Thus

$$\rho'_{q_2} = \sum_y K_y \rho_{q_1} K_y^\dagger . \tag{151}$$

² The proof given here of Holevo's original result (Ref.[6]) is very similar to the one first given by Schumacher and Westmoreland in Ref.[7].
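As explained in Ref.[3], the Kraus operators $\{K_y\}_{\forall y}$ can be extended to a unitary matrix $U_{q,y}$. Let

$$\langle q_2|K_y|q_1\rangle = \left[\langle q_2|_q \langle y|_y\right] U_{q,y} \left[|q_1\rangle_q |0\rangle_y\right] \tag{152}$$

for all $q_1, q_2 \in S_q$ and $y \in S_y$. Now we can define

$R_{q_2,y_2,x_{cl}}$ = [QB net diagram; lost in extraction] (153)

where

$$A(q_2, y_2|q_1, y_1 = 0) = \left[\langle q_2|_q \langle y_2|_y\right] U_{q,y} \left[|q_1\rangle_q |0\rangle_y\right] \tag{154}$$

for all $q_1, q_2 \in S_q$ and $y_2 \in S_y$.

Note that $R_{q_2,y_2,x_{cl}}$ satisfies $R_{q_2} = \rho'_{q_2}$.
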

Claim 15 If $\rho_{q_1,Q,x_{cl}}$ is the QB net of Eq.(150) with $q$ replaced by $q_1$, and $R_{q_2,y_2,x_{cl}}$ is the QB net of Eq.(153), then

$$S_R(y_2 : x_{cl}) \le Hol\{P(x), \rho_{q_1|x}\}_{\forall x} . \tag{155}$$

proof:

$$S_R(y_2 : x_{cl}) \le S_R(q_2, y_2 : x_{cl}) \tag{156a}$$
$$\le S_\rho(q_1 : x_{cl}) \tag{156b}$$
$$= Hol\{P(x), \rho_{q_1|x}\}_{\forall x} . \tag{156c}$$

(156a): Follows because of MRE with $\mathcal{T} = \mathrm{tr}_{q_2}$.

(156b): Follows from the multi-graph data processing inequalities.

(156c): Follows from Claim 14.
QED

Define the accessible information Acc of the ensemble $\{P(x), \rho_{q_1|x}\}_{\forall x}$, maximized over any channel with Kraus operators $\{K_y\}_{\forall y}$, by

$$Acc\{P(x), \rho_{q_1|x}\}_{\forall x} = \max_{\{K_y\}_{\forall y}} S_R(y_2 : x_{cl}) . \tag{157}$$

Claim 15 implies that

$$Acc\{P(x), \rho_{q_1|x}\}_{\forall x} \le Hol\{P(x), \rho_{q_1|x}\}_{\forall x} . \tag{158}$$


A Appendix: Schmidt Decomposition 

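In this appendix, we define the Schmidt decomposition of any bi-partite pure state. 
Consider any pure state $|\psi\rangle_{a,b} \in \mathcal{H}_{a,b}$. It can be expressed as

$$|\psi\rangle_{a,b} = \sum_{a,b} A(a,b)\,|a\rangle_a |b\rangle_b . \tag{159}$$

Assume $S_a \supseteq S_b$. Thus, $N_a \ge N_b$. $A(a,b)$ can be thought of as an $N_a \times N_b$ matrix. Let its singular value decomposition be

$$A(a,b) = \sum_{a_1 \in S_a} \sum_{b_1 \in S_b} U(a,a_1)\sqrt{P(b_1)}\,\delta(a_1,b_1)\,V^\dagger(b_1,b) \tag{160}$$

for all $a \in S_a$, $b \in S_b$, where $U$ and $V$ are unitary matrices. Then we can express $|\psi\rangle_{a,b}$ as

$$|\psi\rangle_{a,b} = \sum_{b_1 \in S_b} \sqrt{P(b_1)}\,|b_1\rangle'_a |b_1\rangle'_b , \tag{161}$$

where

$$|a_1\rangle'_a = \sum_{a \in S_a} U(a,a_1)|a\rangle_a \tag{162}$$

for all $a_1 \in S_a$ and

$$|b_1\rangle'_b = \sum_{b \in S_b} V^\dagger(b_1,b)|b\rangle_b \tag{163}$$

for all $b_1 \in S_b$. Eq.(161) is called the Schmidt Decomposition of $|\psi\rangle_{a,b}$.
Claim 16 If $\rho_{a,b} \in dm(\mathcal{H}_{a,b})$ is pure, then

$$S(a) = S(b) . \tag{164}$$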
proof:

Let

$$\rho_{a,b} = [\,|\psi\rangle_{a,b}\,][\,h.c.\,] . \tag{165}$$
If we express $|\psi\rangle_{a,b}$ as in Eq.(161), then

$$S(a) = H\{P(b_1)\}_{\forall b_1 \in S_b} = S(b) . \tag{166}$$

QED

B Appendix: Partial Entropies of Pure Multi-Partite State

In this appendix, we state some consequences of Claim 16 for the partial entropies of pure multi-partite states.

Let $a_J = (a_j)_{j \in J}$ for any $J \subset Z_{1,N}$.

Claim 17 Suppose $J$ is a nonempty subset of $Z_{1,N}$ and $J^c = Z_{1,N} - J$. If $\rho_{a_1,\ldots,a_N}$ is pure, then

$$S(a_J) = S(a_{J^c}) . \tag{167}$$

For example, for $N = 4$, this means

$$\begin{array}{l} S(a_1, a_2, a_3, a_4) = 0 , \\ S(a_1, a_2, a_3) = S(a_4) \text{ and permutations} , \\ S(a_1, a_2) = S(a_3, a_4) \text{ and permutations} . \end{array} \tag{168}$$

proof: This is just a generalization of Claim 16.
QED

Claim 18 Suppose $I, J$ are nonempty, disjoint subsets of $Z_{1,N}$ such that $I \cup J = Z_{1,N}$. If $\rho_{a_1,\ldots,a_N}$ is a pure state, then

$$S(a_I|a_J) = -S(a_J) = -S(a_I) , \tag{169a}$$
$$S(a_I : a_J) = 2S(a_I) = 2S(a_J) . \tag{169b}$$

proof: Obvious.
QED
Claim 19 Suppose $I, J, K$ are nonempty, disjoint subsets of $Z_{1,N}$ such that $I \cup J \cup K = Z_{1,N}$. If $\rho_{a_1,\ldots,a_N}$ is a pure state, then

$$S(a_I|a_J) = S(a_K) - S(a_J) , \tag{170a}$$
$$S(a_I|a_J) = -S(a_I|a_K) , \tag{170b}$$
$$S(a_I : a_J) = S(a_I) + S(a_J) - S(a_K) , \tag{170c}$$
$$S(a_I : a_J|a_K) = S(a_I : a_J) . \tag{170d}$$

proof: Obvious.
QED

Claim 20 Suppose $I, J, K, L$ are nonempty, disjoint subsets of $Z_{1,N}$ such that $I \cup J \cup K \cup L = Z_{1,N}$. If $\rho_{a_1,\ldots,a_N}$ is a pure state, then

$$S(a_I : a_J|a_K) = S(a_I|a_K) + S(a_I|a_L) , \tag{171a}$$
$$S(a_I : a_J|a_K) = S(a_I : a_J|a_L) . \tag{171b}$$

proof: Obvious.
QED

C Appendix: RUM of Pure States 

In this appendix, I describe what I call the RUM (Roots of Unity Model) of pure 
states. The model only works for pure states, and even for those there is no guarantee 
that it will always give the right answer. That's why I call it a model. 

One famous physics "model" is the Bohr model of the Hydrogen atom. The 
Bohr model gives some nice intuition about what is going on, plus it predicts some 
(not all) of the features of the Hydrogen spectrum. 

The RUM of pure states gives some insight into why quantum conditional entropies $S(b|a)$ can be negative unlike classical conditional entropies $H(b|a)$ which 
are always non-negative. It also gives some insight into the identities presented in 
Appendix B for the partial entropies of multi-partite states. It "explains" such identities as being a consequence of the high degree of symmetry of pure multi-partite 
states. 

Consider an $N$-partite pure state described by random variables $a_1, a_2, \ldots, a_N$. 
We redefine the random variables $a_j$ so that they equal the $N$'th roots of unity: 
$$a_j = \exp\left(i\,\frac{2\pi(j-1)}{N}\right) \tag{172}$$

for $j \in Z_{1,N}$. Let $J$ be any nonempty subset of $Z_{1,N}$. Let $\sum a_J = \sum_{j \in J} a_j$. We redefine the entropy of the $N$-partite state as follows

$$S(a_J) = \left|\sum a_J\right| . \tag{173}$$

Note that the various subsystems $a_j$ contribute to this entropy in a coherent sum, instead of the incoherent sums that we usually find when dealing with classical entropy.

Note that

$$\sum a_J = -\sum a_{J^c} \tag{174}$$

so

$$S(a_J) = S(a_{J^c}) . \tag{175}$$

This identity was obtained in the exact case too, in Claim 17.

Let $J, K$ be two nonempty disjoint subsets of $Z_{1,N}$. In this model

$$S(a_K|a_J) = S(a_K, a_J) - S(a_J) = \left|\sum a_{K \cup J}\right| - \left|\sum a_J\right| , \tag{176}$$

which clearly can be negative.

From the triangle inequalities

$$\left|\,\left|\sum a_J\right| - \left|\sum a_K\right|\,\right| \le \left|\sum a_{J \cup K}\right| \le \left|\sum a_J\right| + \left|\sum a_K\right| . \tag{177}$$

This can be re-written as

$$|S(a_J) - S(a_K)| \le S(a_J, a_K) \le S(a_J) + S(a_K) . \tag{178}$$
We recognize this as the Araki-Lieb inequality and subadditivity of the joint entropy.
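
The RUM is simple enough to evaluate directly. The sketch below implements Eqs.(172), (173) and (176) for $N = 4$ and checks the complement identity (175). The function names and the 0-based indexing of the subsystems are assumptions made for this illustration.

```python
import cmath
from itertools import combinations

N = 4
# a_j = exp(i 2 pi (j-1)/N) for j = 1..N, stored with 0-based index j-1.
ROOTS = [cmath.exp(2j * cmath.pi * k / N) for k in range(N)]

def rum_entropy(subset):
    """Model entropy S(a_J) = |sum_{j in J} a_j| for a set J of 0-based indices."""
    return abs(sum(ROOTS[k] for k in subset))

def rum_conditional(k_subset, j_subset):
    """Model conditional entropy S(a_K|a_J) = S(a_K, a_J) - S(a_J)."""
    return rum_entropy(set(k_subset) | set(j_subset)) - rum_entropy(j_subset)

FULL = set(range(N))

# Eq.(175): S(a_J) = S(a_{J^c}) for every nonempty proper subset J.
COMPLEMENT_OK = all(
    abs(rum_entropy(J) - rum_entropy(FULL - set(J))) < 1e-12
    for r in range(1, N) for J in combinations(range(N), r)
)

S_FULL = rum_entropy(FULL)                        # entropy of the whole pure state
S_NEGATIVE = rum_conditional({1, 2, 3}, {0})      # a conditional entropy below zero
```

With all four subsystems included the roots sum to zero, so `S_FULL` vanishes as expected for a pure state, while `S_NEGATIVE` equals $-1$, illustrating how the coherent sum allows negative conditional entropies.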